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ON  SOME  NUMERICAL  PROPERTIES  OF  ARMA  PARAMETER  ESTIMATION  PROCEDURES 


II.  Joni'ph  Newton 
Institute  of  Statistics, 
Texas  A&M  University 


Abst^a« 

This  paper  reviews  the  n lp,<»r  1  thms  iise«l  by  ntnt  Lstlclans  for 
obtaining  efficient  estimators  of  the  parameters  of  a  univari¬ 
ate  autoregressive  moving  average  (ARMA)  time  series.  The 
connection  of  the  estimation  problem  with  the  problem  of  pre¬ 
diction  Is  Investigated  with  particular  emphasis  on  the  Kalman 
filter  and  modified  Cholesky  decomposition  algorithms.  A 
result  from  prediction  theory  Is  given  which  provides  a  signi¬ 
ficant  reduction  In  the  computations  needed  In  Ansley's  (1979) 
estimation  procedure.  Finally  It  is  pointed  out  that  there  are 
many  useful  facts  In  the  literature  of  control  theory  that  need 
to  be  investigated  by  statisticians  Interested  In  estimation 
and  prediction  problems  In  linear  time  series  models. 
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decomposition  algorithm 


1,  INTRODUCTION 


nlte  order  autoregressive  process. 


Let  {e(t),CeZ),  Z  the  .set  of  integers,  be  a 
white  noise  time  series  of  uncorrelatcd  zero 
mean  random  variables  having  common  variance 
0^.  Then  the  time  series  {Y(t),t€Z)  satisfy¬ 
ing 

P  9 

Y(t)  -  -Z  o(J)Y(t-J)-k€(t)+  Z  B(k)E(t-k)  (1.1) 
J»1  k“l 

is  called  an  autoregressive  moving  average  pro¬ 
cess  of  order  p  and  q  (ARMA(p,q)).  If  p=0  then 
Y(-)  Is  a  moving  average  process  of  order  q, 
MA(q),  while  If  q-0  Y(-)  Is  an  autoregressive 
process  of  order  p,  AR(p). 

Defining  a(O)-0(O)"l  and  the  complex  valued  poly¬ 
nomials 

P  1  '>  k 

g(*)«  Z  a(J)z^  ,  h(z)-  Z  B(k)z  , 

J“0  k“0 

we  can  write  (1.1)  as 

P  9 

E  a(J)Y(t-J)-  E  B(k)c(t-k) 
j-0  k>0 

or 

g(L)Y(t)-h(L)c(t), 

where  L  Is  the  lag  or  back  shift  operator, 

L^Y(t)-y(t-J) ,  JeZ.  If  the  zeros  of  g(-)  are 
outside  the  unit  circle  then  Y(-)  is  stationary, 
l.e.  It  can  be  written  as  an  Infinite  order 
moving  average  process,  while  If  the  zeros  of 
h(-)  are  outside  the  unit  circle  then  Y(*)  Is 
Invertible,  l.e.  It  can  be  written  as  an  Infi- 


The  ARMA  model  has  been  very  useful  In  znaly- 
zlng  time  scries  data.  Given  a  sample  reali¬ 
zation  yTc(y(1) , . . . ,Y(T))  from  Y(')  one  seeks 
T  ^  ^  -  T*** 

estimators  o=(a(l) ,... ,a(p))  ,  §-(b(1),.-. 

B(q))‘,  and  o^  of  the  parameters  ct,§,  and  o^, 

as  well  as  memory-t,  horlzon-h,  minimum  mean 
square  error  linear  predictors  and  prediction 

variances  Y(t-l-h|t),  o?  -E{Y(t+h)-Y(t+h|  t) 

C » tl 

of  Y(t-l-h)  given  Y(l) , . . .  ,Y(t)  for  a  variety  of 
memories  and  horizons  t“tj^, . . .  .t^,  h“hj^, . . .  .hj. 

Thus  Y(t+hlt)»l3  .Y  where  1^  .  minimizes 
S(p-E{Y(t-l-h)-£’'Y|.}2  , 

and  o^  .“S(l  .  ) .  This  gives  (see  Whittle  (1963) 

t,h  't,h 

p.47)  that  \  .  satisfies  the  normal  equations 

-t,h 


‘’Y,t-t,h"5t,h 


”t,h"'‘Y^°^"-t,h  ’’Y,tJt,h’ 
where  r^  (R^  (t+h-1) , . . .  ,Ry  (h))  with  Ry(v)“ 
E(Y(t)Y(t+v)) ,veZ,  and  r„  Is  the  (txt)  Toeplltz 

*  t  c 

covariance  matrix  of  Y/,  l.e.  T.,  .  has  (J,k)th 
*t  I  j  t 

element  RY(k-J).  We  can  thus  define  ^  by  Its 
first  row;  ^-T0EPL(R^(0) . ^^(t-l)). 


The  usual  assumption  made  for  doing  estimation 
and  prediction  In  ARMA  models  Is  that  c(‘)  (and 
thus  Y(*))  is  a  Gaussian  process.  Thus  the  maxi- 


wim  likcllliooil  c<)t  InKitnrn  a,B,  and  mnxlmtze 
the  Goiinnlnii  llkrllhninl 

Ka,  §. o2 /Y^)-  (?  n ) I ■'»exp . 
while  Y(t+h|t)  and  .  are  the  conditional  mean 

C  *  n 

and  variance  of  Y(t+h)  given  Y(l) , . . . ,Y{t) . 

In  tlila  paper  we_l)  review  nttemptn  hy  ntntlnCi- 
ctans  to  obtain  a.^,  and  (or  estlmatora 

aaympto* Ically  equivalent  to  them),  2)  show  how 
recent  methods  are  closely  connected  with  finding 
predictors,  and  3)  propose  an  Improvement  of 
Ansley's  (1979)  estimation  procedure.  This  pro¬ 
cedure  Is  currently  regarded,  at  least  In  many 
situations,  as  the  most  numerically  efficient 
available.  Finally  wo  will  be  able  to  compare 
the  two  most  popular  methods  available. 

2.  APPROXIMATE  METHODS 

There  have  been  three  basic  types  of  procedures 
used  to  estimate  the  parameters  of  the  ARMA  model: 
1)  estimators  that  are  derived  hcurlstlcally  and 
then  shown  to  be  asymptotically  equivalent  to  the 
HLE,  2)  estimators  obtained  by  maximizing  a  func¬ 
tion  asymptotically  equivalent  to  the  likelihood 
L,  and  3)  directly  maximizing  L.  The  procedures 
have  evolved  as  both  computing  software  and  hard¬ 
ware  have  improved. 

As  an  example  of  a  type  1  procedure  we  consider 
Hannan's  (1969)  method.  The  method  consists  of 
1)  finding  consistent  Initial  estimators 

and  2)  performing  an  alternating  proced¬ 

ure  of  the  form 

3)  combining  to  obtain  asymptotically 

efficient  B,  6)  8  o,  and  5)  possibly  Iterat¬ 
ing  the  process  by  returning  Co  (2)  with  a  re¬ 
placing  at**). 

The  Initial  estimators  and  are  obtained 

by  noting  the  following  facts  about  the  ARMA 
■^el: 

P 

1)  t  a(J)R^(J-v)-0  ,  v»q+l, . . .  ,q+p  (2.1) 

J-0  ^ 

2)  By  writing 

g(L)Y(t)-h(L)c(t)-X(t),  (2.2) 

we  note  that  X(-)M<A(q,B.q^).l.*e.  X(-)  la  an 
MA(q)  process  with  parameters  8  and  o^.  Thus 

by  the  first  equality  In  (2.2)  we  have 
P 

K(y)-  t  <i(j)a(k)R^(l+v-k),vcZ,  (2.3) 

*  J.k-0  ^ 

while  froai  the  second  equality  we  have 


R^(v)- 


q-lv| 

I  B(k)8(k->-|v|)  ,  lv|_<q 
k-0 

,  0  ,  lv|>q 


(2.4) 


Then  defining  the  consistent  sample  autocovarl- 
ances 

Rt(v)-|  ^'i''^(t)Y(t+(vl)  ,  |v|  <  T  , 

t-1 

is  found  by  solving  (2.1)  with  R^(v) 
replacing  R^Cv),  while  Is  obtained  by  first 

using  and  R.j(*)  In  (2.3)  to  obtain  con¬ 

sistent  estimators  R^(®) (0) , . . . (q)  which 
are  then  used  In  (2.4)  to  get  8^®)  via  Wilson's 
(1969)  or  Bauer's  (1955)  algorithm. 


Steps  2  and  4  of  Hannan's  method  are  performed 
in  the  frequency  domain  by  noting  from  (2.2) 
that  the  spectral  density  functions  1^(0  > 

fjj(-).  and  f^(-)  of  Y(-),  X(-),  and  e(-)  are 

related  by  (since  fj(w)«  o*/2»  >  we(0,2x1) 

|8(e^”)|?f^w)-  |h(e^'')|*  (2.5) 


Then  we  can  rewrite  (2.5)  ss»  dropping  arguments 
for  convenience. 


2ir 


ThP 


(2.6) 


^Y  gZ  1 

ThF‘^  TIP 

(2x/gZ)Z|g|Zf^  j 

Ihp  2s  Thp 


(2.7) 


(2.8) 


Thus  the  right  hand  sides  of  (2.6), (2.7),  and 
(2.8)  are  In  the  form  of  an  autoregresalva  spect¬ 
ral  density,  and  thus  for  v>p. 


q  2s  |g|Zf 

^  8(J)/  -77- 

J-0  0  ^X 


Y 


,t(J-v)w 


dv  ■ 


«  ^ 
V 


(2.9) 


2s  f. 


^  “(J^/  ThR  * 
j-0  0 


y  .i(j-v)w 


dw 


6  gZ 

V 


q  2s 

I  B(J)/ 

J-0  0 


(2.10) 

(2.11) 


Now  f^  can  be  estimated  by  the  perlodogrom 

't<'>  •  i  I  f.,  • 

|v|  <T 

and  since  X(*)  -  MA(q,  8.  0*)  we  have  that 


Then  for  example,  In  nn  obvious  notnrion,  fhe 
step  -►  Is  done,  by  solving  (2.9)  with 

f.j,  and  replacing  g,  and  and 

the  fast  Fourier  transform  used  to  c.ilculnte  a 
rectangular  sum  approximation  to  the  Integrals 

needed.  Then  (2.10)  Is  used  for  6^^^  •*  and 

B  a  while  (2.11)  Is  used  for  -►  6^^'. 

Finally  the  combination  of  B^'^  and  Is  done 

using  an  analogy  with  two-stage  least  squares. 

In  a  pivotal  article,  Akalkc  (1973)  noted  that 
Hannan's  method  was  the  same  ns  one  step  In 
directly  maximizing  L  using  the  Nrwton-Raphson 
procedure  with  nn  approximation  to  the  Hessian 
matrix  of  L.  This  observation  led  many  re¬ 
searchers  to  turn  their  attention  to  other  pos¬ 
sible  approaches  to  directly  maximize  L.  We  will 
consider  two  of  these  In  the  next  section. 

As  an  example  of  a  type  2  estimation  procedure 
we  consider  the  method  of  Box  and  Jenkins  (1970). 

T  -1 

In  large  samples  the  term  expC-ljY^rY  .j,Y.j,)  domi¬ 
nates  the  term  jr^  .^1  Thus  Box  and  Jenkins 
suggest  maximizing 

L'(o,  B,  o2|y.j,)  -  exp(- 

which  can  be  done  in  a  nonllneat  tegresslon 
framework.  Thus  It  can  be  shown  that 
T 

Tt^y!t?t  ■  ^ 

where  (eft))  «  E(e(t)|Y.j,,  a,  8).  Thus  the  Box- 

Jenklns  procedure  consists  of  replacing  -«■  by 
-Q  In  (2.12)  and  approximating  [c(t)]  by  back- 
forecasting. 

It  is  generally  agreed  that  In  the  case  of  large 
T  and/or  zeros  of  h(z)  far  from  the  unit  circle 
that  estimation  methods  of  type  1  and  type  2 
give  very  reasonable  results.  However  if  one 
Is  faced  with  a  situation  where  tlie  above  con¬ 
ditions  are  not  satisfied,  then  the  numerical 
properties  of  the  procedures  nullify  the  bene¬ 
fit  of  their  simplicity.  Thus  Iterations  often 
fall  to  converge  In  a  reasonable  time  and  .sys¬ 
tems  of  equations  that  must  be  solved  become 
highly  Ill-conditioned.  Tlius  there  has  been  a 
need  for  more  numerically  stable,  exact  maxl- 
mus  likelihood  methods. 

3.  EXACT  MAXIMUM  LIKELIHOOD  .PRIXIEDURES 

With  the  advent  of  Iterative  nonlinear  optimi¬ 
zation  procedures  which  require  only  a  starting 
value  and  evaluation  of  the  function  to  be 
optimized  for  given  values  of  Its  arguments 
(see  Dennis  and  More  (1977),  for  cx.imple) ,  the 
most  recent  procedures  for  ARMA  parameter  esti¬ 
mation  have  centered  on  evaluating  L  for  given 
values  of  a,  B,  and  o^. 


In  this  ‘ii’rt  Ion  we  consider  two  such  methods 
for  evaluating  t:  1)  using  the  Kalman  filter 
algorithm  nnd  7)  using  covnrlnnce  matrix  de¬ 
composition  methods,  particularly  the  Cholesky 
decomposition  algorithm.  The  basic  purpose  of 
these  algorithms  Is  to  obtain  the  one  step  ahead  - 

prediction  errors  c(t)  -  Y(t)  -  Y(t|t  -  1)  and 
prediction  variances  -  oJ_^  1'  ^  ^ . 

el nee  one  con  easily  show  that 

|r,,  _|  -  n  a^,  -  E  . 

Y,T  t  .T  Y.T.T  o^ 

To  atimmarlzc,  the  two  methods  currently  regarded 
as  most  numerically  efficient  for  maximizing  L 
consist  of  two  stages; 

1)  Find  Initial  estimators  as  described  above 
for  Hannan's  method. 

2)  Use  an  Iterative,  derivative  free  nonlinear 
optimization  algorithm  to  find  the  maximum 
likelihood  estimators,  using  either  the 
Kalman  filter  algorithm  (Akalke  (1976), 

Harvey  and  Phillips  (1979),  Jones  (1980), 
Pearlmnn  (1980),  or  Gardner,  Harvey,  and 
Phillips  (1980),  for  ex.imple)  or  the  Cho¬ 
lesky  decomposition  algorithm  (see  Pagano 
and  Parzen  (1973),  Pagano  (1976),  Phadke 
and  Kedem  (1978),  Ansley  (1979),  Newton 
(1980),  Newton  nnd  Pagano  (1981),  for  ex¬ 
ample)  to  find  the  e(t)  and  o^  needed  to 
evaluate  L. 

Kalman  Filter  Algorithm 

Consider  the  following  two  equations: 

Y^^j  “  ^t^t  *  ^t^t  Equation) 

X^^j^  «  (observation  equation) 

where  the  Y's  are  unobservable  N-vectors,  the 
X's  are  observable  M-vectors,  H^,  G^,  and  are 

known  (NxN),  (NxL) ,  and  (MxN)  matrices,  and  U^, 

are  Independent  zero  mean  L  and  N  dimensional 

white  noise  series  with  known  covariance  matrices 
and  R^. 

Then  given  initial  v.iluea  Y^  and  Pq  “  varlY^) 
one  finds  the  T^'s  and  P^  ■  cov(Y^)  by  the  Kal¬ 
man  filter  algorithm 

-t+l  "  -t+1  ■  ''t+l^®t+l?t+l  ■  ?t+l^ 

^+l  “  ”tn  -  ’'tM^+lVl 

!t+i  ■  ”t!t'  ^+l  ■  Vl^m 


‘<St+lVlSt+l  ^  "t+l) 


“tn  •  "t^"t  *  '^t^t^ 


A  nianber  of  authors  In  Che  statistical  lltera- 


turc  hnvf'  polnlrH  out  tlwit  n  'ilmplo  /ipplfrntlon 
of  this  Algorithm  to  AKMA  motJcls  ran  be  made  to 
obtain  the  e(t)  and  o?.  •Thun  alnce 

t(t+J)|t+l)  -  Y(t+J|t)  +  gjr.(t+l), 

J  -  I,  ....  m  -  1 
P 

L  a(k)Y(t-H»-k|  t)  +  g  r.(t4-l), 
k-1  " 

I.  J  -  B 

J-1 

where  g,  “  1  and  g,  -  B(J-l)  +  j:  a(k)g.  .  ,Jl2 
1  J  k-1  J 


and  n  -  iiiax(p,  q+l).  we  can  write  the  equation 
of  atate 

!t+i  ■  ”  !t  • 

where  g^  -  (g^.  gj . g^) ,  and 

H  -  f  0  1  0  ...  0  1 

0  0  1  ...  0 


0  0  0 
a(ni)  . 


...  0(1) 


where  a(J)  -  0  If  J  >  p. 

Further,  since  Y(t-*l|t+l)  ■  YCt+l),  we  can  write 
the  observational  equation 

^41  -  ^t4l  ^  "t4l  • 
where  -  Y(t+1),  -  (1,  0,  ...,  0),  and 

la  an  observational  error  randan  variable. 

Cholesky  Decomposition  Algorlthn 

An  (nm)  synnctrlc  matrix  Is  positive  defi¬ 
nite  If  and  only  if  It  can  be  written 

n°A  „  •  <5.2) 

n  Agfl  AfM  Agfl 

where  L.  Is  a  unit  lower  triangular  matrix  and 
A  gd 

^  la  a  diagonal  matrix  with  positive  diagonal 

elements.  This  is  the  modified  Cholesky  decom¬ 
position  of  A  .  We  note  that  we  can  also  write 

”  T 

the  Cholesky  decomposition  A  -  M.  M.  where 
t  n  A,n  A,n 

M.  -  L,  D.  but  we  will  consider  exclusively 
A,n  A,n  A,a 

the  modified  decomposition. 

The  decomposition  (3.2)  Is  unique  and  nested, 

— ’  *'A,J’  '^A.j’  *'a!j’  ®a!j  ^5*J)  princi¬ 
pal  minors  of  I.,  ,  D,  ,  L,  ^  ,  and  D,^'  respec- 

Agfi*  Agd  Agd  Apii  ^ 

tlvely  for  J  £  n.  Thus  we  can  denote  the  (],k)th 
clement  of  L,,  and  D„  In  the  decomposition  of 

the  (KxK)  covArlance  matrix  ^  of  a  time  scries 


Y(*)  nn  hy  .  .  nnrf  D  .  .  respectively  for  t  £  K. 

If  Y(*)  la  a  purely  nondetermlnlstlc  covariance 
stationary  time  series  with  spectral  density 
function  t(*),  l.e. 

.-?x 

o^  -  2x  exp{^  /  log  f(w)dw}  >  0  , 

-  2x  J) 

then  e(l)  -  Y(l),  while 
t-1 

e(t)  -  Y(t)  -  r  Iv  ,  ,  ve(t-k),  t  >  2 

k-1 

"t  “  ^Y.t.t  ’ 

and  in  fact  (Newton  and  Pagano  (1981)) 
t-Hi-l 


Y(t+h|t) 


^Y,t+h,t+h-k*^*^"'‘^ 


h-1 

,2  .  -  r  l2 


°t,h  "  ^^Q^,t+h,t4h-k®Y,t+h-k,t+h-k 
Further, 


11m  L  I-  -  B  (J),  J  >  0 
Y,K.K-j 


where  the  B„(-)  are  the  coefficients  In  the  MA(») 
representation  of  Y(*).  Wlille  these  facts  ap¬ 
pear  to  be  known  In  the  literature  of  control 
theory  they  do  not  appear  to  exist  In  the  sta¬ 
tistical  literature,  at  least  In  the  general 
form  of  equations  (3.3)  through  (3.6). 

Now  major  simplifications  of  these  equations 
occur  when  Y(»)  is  an  ARMA  process.  Thus  (Pa¬ 
gano  and  Parren  (1973),  Pagano  (1976),  Newton 
(1980))  If  Y(-)  Is  an  MA(q)  we  have  that  J  k 

-  0  If  |j-k|  >  0  and  thus  Ly,,“0  1fj-k 

>  0.  Further,  Pagano  and  Parzen  (1973)  state 
In  an  attempt  to  find  Y(t4h|t)  and  o^  ^  that  If 

Y(>)  *  ARMA(p,  q,  a,  B>  o^)>  and  one  forma  the 
P  *  ' 

aeries  X(t)  »  z  o(j)Y(t-j),  t  >  p,  then 
j-0 

X(p4l),  ...,  x(T)  la  a  realization  from  an 
^(9>  §>  process  and  one  can  combine  the 
MA(q)  prediction  algorithm  with  an  AR(p)  pre¬ 
diction  algorithm.  Ansley  (1979)  makes  this 
same  transformation  of  Y(*)  to  X(*)  to  get  the 
eft)  and  oj  necessary  to  evaluate  L. 

The  methods  of  Pagano  and  Parzen  (1973)  and 
Ansley  (1979)  can  be  suimnarlzed  theoretically 
by  combining  the  equations  (3.3)  through  (3.6) 
above  with  the  following  theorem  due  to  Newton 
and  Pagano  (1981). 

Theorem  • 

Let  Y(*)  -  AItMA(p,  q,  o,  6.  o*)  and  Z(*) 

*  AK(p,  a,  o^)  with  associated  covariance  matrix 

sequences  -  L^^^ 

®Z.t4,f 


Sr.t  "  • 

®y.t  “  °x.t  • 

where  -  (X(l) . X(t))^  - 

element 


X(J)  -  Y{J) 

J-1 


.  J  -  1 


r  a  (k)Y(J-k)  ,  J  -  2 . . 

■  k-0 
P 

I  a(k)Y(J-k)  .  J  >  p 

k-0 

where  the  are  easily  obtained  by  perform¬ 

ing  the  Levinson  (19 M)  recursion  for  J  -  p-1, 
....  1,  with  -  n(k); 


Ojd)  - 


1  <  J  <  p. 


1  -  aj^^^(J+l) 


i»-max(l.J-p,-»-l 

1 

*  r  o^_,a-^)RY(^-B),l,Jil 

£-max(l,l-p)’’ 


°t,h  “  '’x,t+h-k,t+h-k 


k-0  " 

t+h  ”1 

Thus  to  find  tlie  one  step  abend  prediction 
errors  nnd  vnrlnnces  one  need  only  compute  ttie 
q  nonzero  elements  of  the  successive  rows  of  Lj^ 

until  the  convergence  properties  (1.7)  nnd  (1.8) 
take  effect.  Further,  one  can  then  also  find 
the  Y(t+h|t)  from  these  same  quantities  and  the 
e(*).  Note  however  that  to  find  .  one  also 


needs  to  find  the  elements  of  L_ 


These  can 


be  found  simply  by  noting  that  tlie  upper  (pxp) 

principal  minor  I,  can  be  found  by  inverting 
^  *P 

the  lower  triangular  matrix  L_  ,  while 

Z,p 

‘‘Z.J  k  "  J  i  >'  i  P  • 

where  y(0)  -  1,  yCU,  y(2),  ...  are  the  coef¬ 
ficients  In  the  MA(”)  representation  of  Z(*). 
Further,  the  elements  in  rows  p+1 ,  ...  in  the 
first  (p-1)  columns  of  L^  sre  obtained  by 
P 


Z,p+J  ,k 


-  -  r  a(r)L 


Z,p+J-r,k 


,  1  £  k  <  p. 


I  8(k)6(k+|l-j|), 
k-0 

(l-j|  19  l.J  >  P 


|0  It  1  1  J  £  p,  1  >  p,  and  1  -  J  >  q 
or  If  l,j  >  p  and  (l-j|  >  q 


Z.P+J ,k 


1  1  k  <  p  . 


Thus  the  pnidlctors  and  prediction  variances 

Y(t+hlt)  and  o^  can  be  obtained  using  either 
t  ,h 

the  Kalman  filter  algorithm  or  the  Cholcsky  de¬ 
composition  algorithm.  Also  the  convergence 
properties  described  In  the  Cholcsky  algorithm 
can  be  Incorporated  Into  the  Kalman  filter  al¬ 
gorithm. 


tti*  Ljj  ,  ■  B(j),  J  -  1,  ...,  q  (3.7) 

®X.K,K  “  '''■  (3-8) 

K-H-  ’ 

Thus  e(l)  -  X(l),  while 

«ln(q,J-l) 

«(t)  X(t)  -  Z  K  f  r  c«(t-k),  t  >  1 

"t  •  ®x.t.t  ' 

and  In  fact 

p 

Y(t+h|t)  -  X(t+h|t)  -  Z  o(j)Y(t+h-j|t)  , 


h  ■  1 . . 


4.  DISCUSSION 

Thus  we  have  seen  that  algorithms  developed 
recently  for  finding  maximum  likelihood  esti¬ 
mators  of  ARMA  process  parameters  have  essenti¬ 
ally  consisted  of  applying  established  algor¬ 
ithms  (or  finding  the  minimum  mean  square  error 
linear  one  step  ahead  predictors  and  prediction 
variances.  We  have  shown  how  the  Cholcsky  al¬ 
gorithm  c.in  be  used  to  find  more  than  one  step 
ahead  predictors  nnd  variances  as  well.  We  note 
that  both  tlie  Kalman  filter  and  Cholcsky  decom¬ 
position  algorithm  can  be  extended  easily  to 
the  multivariate  ARMA  ease.  Wo  also  note  that 
we  have  not  attempted  to  survey  all  the  recent 
work  In  the  ARMA  estimation  area,  but  rather 
have  emphasized  those  algorithms  that  appear  to 
be  most  widely  used  In  the  literature. 

We  con.sldcr  next  the  relative  speed  and  stabil¬ 
ity  of  the  Kalman  filter  and  Cholcsky  decom¬ 
position  algorithms.  Pearlman  (1980)  shows  that 


the  numl)er  of  opcrof Ions  needed  to  find  the  c(t) 
and  via  the  Kalman  filter  algorithm  Is  ap¬ 
proximately  T(2p  t  3m  +  3),  with  m  -  ninx(p,q+l), 
while  for  the  Cholesfcy  algorithm  It  is 
T(p  +  \(q+l)(q+A)),  Thus  If  q  2.  5  the  Kalman 
algorithm  la  faster. 

Now  the  fact  that  the  Kalman  filter  algorithm 
performs  a  number  of  "matrix  squaring"  oper- 

T 

tlons  (for  example  (3.1)  has  enused 

many  Investigators  to  question  Its  stability. 
However  a  number  of  authors  have  suggested  meth¬ 
ods  for  avoiding  these  operations  (see  Paige 
(1976)  for  example).  The  Cholesky  decomposition 
Is  well  known  for  Its  numerical  stability  (see 
Wilkinson  (1967)  for  example).  However  more 
work  Is  needed  before  a  final  conclusion  can  be 
made  about  which  of  the  two  methods  Is  to  be 
preferred. 

finally  we  point  out  that  there  are  a  variety 
of  numerical  methods  for  analyzing  linear  time 
series  models  In  the  work  of  control  theorists 
particularly  In  a  series  of  papers  of  Kallath 
(see  the  references  In  Aasnaes  and  Kallath 
(1973))  that  need  to  be  Incorporated  Into  the 
statistical  literature. 
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